{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "whjsJasuhstV"
   },
   "source": [
    "<a href=\"https://colab.research.google.com/github/jeffheaton/app_generative_ai/blob/main/t81_559_class_06_1_rag.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "euOZxlIMhstX"
   },
   "source": [
    "# T81-559: Applications of Generative Artificial Intelligence\n",
    "**Module 6: Retrieval-Augmented Generation (RAG)**\n",
    "* Instructor: [Jeff Heaton](https://sites.wustl.edu/jeffheaton/), McKelvey School of Engineering, [Washington University in St. Louis](https://engineering.wustl.edu/Programs/Pages/default.aspx)\n",
    "* For more information visit the [class website](https://sites.wustl.edu/jeffheaton/t81-558/)."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "d4Yov72PhstY"
   },
   "source": [
    "# Module 6 Material\n",
    "\n",
    "* **Part 6.1: Introduction to Retrieval-Augmented Generation (RAG)** [[Video]](https://www.youtube.com/watch?v=qA52K0K181Q) [[Notebook]](t81_559_class_06_1_rag.ipydb)\n",
    "* Part 6.2: Introduction to ChromaDB [[Video]](https://www.youtube.com/watch?v=R53lo4sevLQ) [[Notebook]](t81_559_class_06_2_chromadb.ipynb)\n",
    "* Part 6.3: Understanding Embeddings [[Video]](https://www.youtube.com/watch?v=Tq82Gl2ZZNM) [[Notebook]](t81_559_class_06_3_embeddings.ipynb)\n",
    "* Part 6.4: Question Answering Over Documents [[Video]](https://www.youtube.com/watch?v=hCwL_lW-gP0) [[Notebook]](t81_559_class_06_4_qa.ipynb)\n",
    "* Part 6.5: Embedding Databases [[Video]](https://www.youtube.com/watch?v=BG2gT4uYxhM) [[Notebook]](t81_559_class_06_5_embed_db.ipynb)\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "AcAUP0c3hstY"
   },
   "source": [
    "# Google CoLab Instructions\n",
    "\n",
    "The following code ensures that Google CoLab is running and maps Google Drive if needed."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "xsI496h5hstZ",
    "outputId": "6f63a902-7bc2-4021-b790-03b9a6ad2908"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Note: using Google CoLab\n",
      "Requirement already satisfied: langchain in /usr/local/lib/python3.10/dist-packages (0.2.5)\n",
      "Requirement already satisfied: langchain_openai in /usr/local/lib/python3.10/dist-packages (0.1.8)\n",
      "Collecting langchain-community\n",
      "  Downloading langchain_community-0.2.5-py3-none-any.whl (2.2 MB)\n",
      "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m2.2/2.2 MB\u001b[0m \u001b[31m21.3 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
      "\u001b[?25hRequirement already satisfied: pypdf in /usr/local/lib/python3.10/dist-packages (4.2.0)\n",
      "Requirement already satisfied: chromadb in /usr/local/lib/python3.10/dist-packages (0.5.3)\n",
      "Requirement already satisfied: PyYAML>=5.3 in /usr/local/lib/python3.10/dist-packages (from langchain) (6.0.1)\n",
      "Requirement already satisfied: SQLAlchemy<3,>=1.4 in /usr/local/lib/python3.10/dist-packages (from langchain) (2.0.30)\n",
      "Requirement already satisfied: aiohttp<4.0.0,>=3.8.3 in /usr/local/lib/python3.10/dist-packages (from langchain) (3.9.5)\n",
      "Requirement already satisfied: async-timeout<5.0.0,>=4.0.0 in /usr/local/lib/python3.10/dist-packages (from langchain) (4.0.3)\n",
      "Requirement already satisfied: langchain-core<0.3.0,>=0.2.7 in /usr/local/lib/python3.10/dist-packages (from langchain) (0.2.9)\n",
      "Requirement already satisfied: langchain-text-splitters<0.3.0,>=0.2.0 in /usr/local/lib/python3.10/dist-packages (from langchain) (0.2.1)\n",
      "Requirement already satisfied: langsmith<0.2.0,>=0.1.17 in /usr/local/lib/python3.10/dist-packages (from langchain) (0.1.81)\n",
      "Requirement already satisfied: numpy<2,>=1 in /usr/local/lib/python3.10/dist-packages (from langchain) (1.25.2)\n",
      "Requirement already satisfied: pydantic<3,>=1 in /usr/local/lib/python3.10/dist-packages (from langchain) (2.7.4)\n",
      "Requirement already satisfied: requests<3,>=2 in /usr/local/lib/python3.10/dist-packages (from langchain) (2.31.0)\n",
      "Requirement already satisfied: tenacity<9.0.0,>=8.1.0 in /usr/local/lib/python3.10/dist-packages (from langchain) (8.4.1)\n",
      "Requirement already satisfied: openai<2.0.0,>=1.26.0 in /usr/local/lib/python3.10/dist-packages (from langchain_openai) (1.35.3)\n",
      "Requirement already satisfied: tiktoken<1,>=0.7 in /usr/local/lib/python3.10/dist-packages (from langchain_openai) (0.7.0)\n",
      "Collecting dataclasses-json<0.7,>=0.5.7 (from langchain-community)\n",
      "  Downloading dataclasses_json-0.6.7-py3-none-any.whl (28 kB)\n",
      "Requirement already satisfied: typing_extensions>=4.0 in /usr/local/lib/python3.10/dist-packages (from pypdf) (4.12.2)\n",
      "Requirement already satisfied: build>=1.0.3 in /usr/local/lib/python3.10/dist-packages (from chromadb) (1.2.1)\n",
      "Requirement already satisfied: chroma-hnswlib==0.7.3 in /usr/local/lib/python3.10/dist-packages (from chromadb) (0.7.3)\n",
      "Requirement already satisfied: fastapi>=0.95.2 in /usr/local/lib/python3.10/dist-packages (from chromadb) (0.111.0)\n",
      "Requirement already satisfied: uvicorn[standard]>=0.18.3 in /usr/local/lib/python3.10/dist-packages (from chromadb) (0.30.1)\n",
      "Requirement already satisfied: posthog>=2.4.0 in /usr/local/lib/python3.10/dist-packages (from chromadb) (3.5.0)\n",
      "Requirement already satisfied: onnxruntime>=1.14.1 in /usr/local/lib/python3.10/dist-packages (from chromadb) (1.18.0)\n",
      "Requirement already satisfied: opentelemetry-api>=1.2.0 in /usr/local/lib/python3.10/dist-packages (from chromadb) (1.25.0)\n",
      "Requirement already satisfied: opentelemetry-exporter-otlp-proto-grpc>=1.2.0 in /usr/local/lib/python3.10/dist-packages (from chromadb) (1.25.0)\n",
      "Requirement already satisfied: opentelemetry-instrumentation-fastapi>=0.41b0 in /usr/local/lib/python3.10/dist-packages (from chromadb) (0.46b0)\n",
      "Requirement already satisfied: opentelemetry-sdk>=1.2.0 in /usr/local/lib/python3.10/dist-packages (from chromadb) (1.25.0)\n",
      "Requirement already satisfied: tokenizers>=0.13.2 in /usr/local/lib/python3.10/dist-packages (from chromadb) (0.19.1)\n",
      "Requirement already satisfied: pypika>=0.48.9 in /usr/local/lib/python3.10/dist-packages (from chromadb) (0.48.9)\n",
      "Requirement already satisfied: tqdm>=4.65.0 in /usr/local/lib/python3.10/dist-packages (from chromadb) (4.66.4)\n",
      "Requirement already satisfied: overrides>=7.3.1 in /usr/local/lib/python3.10/dist-packages (from chromadb) (7.7.0)\n",
      "Requirement already satisfied: importlib-resources in /usr/local/lib/python3.10/dist-packages (from chromadb) (6.4.0)\n",
      "Requirement already satisfied: grpcio>=1.58.0 in /usr/local/lib/python3.10/dist-packages (from chromadb) (1.64.1)\n",
      "Requirement already satisfied: bcrypt>=4.0.1 in /usr/local/lib/python3.10/dist-packages (from chromadb) (4.1.3)\n",
      "Requirement already satisfied: typer>=0.9.0 in /usr/local/lib/python3.10/dist-packages (from chromadb) (0.12.3)\n",
      "Requirement already satisfied: kubernetes>=28.1.0 in /usr/local/lib/python3.10/dist-packages (from chromadb) (30.1.0)\n",
      "Requirement already satisfied: mmh3>=4.0.1 in /usr/local/lib/python3.10/dist-packages (from chromadb) (4.1.0)\n",
      "Requirement already satisfied: orjson>=3.9.12 in /usr/local/lib/python3.10/dist-packages (from chromadb) (3.10.5)\n",
      "Requirement already satisfied: httpx>=0.27.0 in /usr/local/lib/python3.10/dist-packages (from chromadb) (0.27.0)\n",
      "Requirement already satisfied: aiosignal>=1.1.2 in /usr/local/lib/python3.10/dist-packages (from aiohttp<4.0.0,>=3.8.3->langchain) (1.3.1)\n",
      "Requirement already satisfied: attrs>=17.3.0 in /usr/local/lib/python3.10/dist-packages (from aiohttp<4.0.0,>=3.8.3->langchain) (23.2.0)\n",
      "Requirement already satisfied: frozenlist>=1.1.1 in /usr/local/lib/python3.10/dist-packages (from aiohttp<4.0.0,>=3.8.3->langchain) (1.4.1)\n",
      "Requirement already satisfied: multidict<7.0,>=4.5 in /usr/local/lib/python3.10/dist-packages (from aiohttp<4.0.0,>=3.8.3->langchain) (6.0.5)\n",
      "Requirement already satisfied: yarl<2.0,>=1.0 in /usr/local/lib/python3.10/dist-packages (from aiohttp<4.0.0,>=3.8.3->langchain) (1.9.4)\n",
      "Requirement already satisfied: packaging>=19.1 in /usr/local/lib/python3.10/dist-packages (from build>=1.0.3->chromadb) (24.1)\n",
      "Requirement already satisfied: pyproject_hooks in /usr/local/lib/python3.10/dist-packages (from build>=1.0.3->chromadb) (1.1.0)\n",
      "Requirement already satisfied: tomli>=1.1.0 in /usr/local/lib/python3.10/dist-packages (from build>=1.0.3->chromadb) (2.0.1)\n",
      "Collecting marshmallow<4.0.0,>=3.18.0 (from dataclasses-json<0.7,>=0.5.7->langchain-community)\n",
      "  Downloading marshmallow-3.21.3-py3-none-any.whl (49 kB)\n",
      "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m49.2/49.2 kB\u001b[0m \u001b[31m5.8 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
      "\u001b[?25hCollecting typing-inspect<1,>=0.4.0 (from dataclasses-json<0.7,>=0.5.7->langchain-community)\n",
      "  Downloading typing_inspect-0.9.0-py3-none-any.whl (8.8 kB)\n",
      "Requirement already satisfied: starlette<0.38.0,>=0.37.2 in /usr/local/lib/python3.10/dist-packages (from fastapi>=0.95.2->chromadb) (0.37.2)\n",
      "Requirement already satisfied: fastapi-cli>=0.0.2 in /usr/local/lib/python3.10/dist-packages (from fastapi>=0.95.2->chromadb) (0.0.4)\n",
      "Requirement already satisfied: jinja2>=2.11.2 in /usr/local/lib/python3.10/dist-packages (from fastapi>=0.95.2->chromadb) (3.1.4)\n",
      "Requirement already satisfied: python-multipart>=0.0.7 in /usr/local/lib/python3.10/dist-packages (from fastapi>=0.95.2->chromadb) (0.0.9)\n",
      "Requirement already satisfied: ujson!=4.0.2,!=4.1.0,!=4.2.0,!=4.3.0,!=5.0.0,!=5.1.0,>=4.0.1 in /usr/local/lib/python3.10/dist-packages (from fastapi>=0.95.2->chromadb) (5.10.0)\n",
      "Requirement already satisfied: email_validator>=2.0.0 in /usr/local/lib/python3.10/dist-packages (from fastapi>=0.95.2->chromadb) (2.2.0)\n",
      "Requirement already satisfied: anyio in /usr/local/lib/python3.10/dist-packages (from httpx>=0.27.0->chromadb) (3.7.1)\n",
      "Requirement already satisfied: certifi in /usr/local/lib/python3.10/dist-packages (from httpx>=0.27.0->chromadb) (2024.6.2)\n",
      "Requirement already satisfied: httpcore==1.* in /usr/local/lib/python3.10/dist-packages (from httpx>=0.27.0->chromadb) (1.0.5)\n",
      "Requirement already satisfied: idna in /usr/local/lib/python3.10/dist-packages (from httpx>=0.27.0->chromadb) (3.7)\n",
      "Requirement already satisfied: sniffio in /usr/local/lib/python3.10/dist-packages (from httpx>=0.27.0->chromadb) (1.3.1)\n",
      "Requirement already satisfied: h11<0.15,>=0.13 in /usr/local/lib/python3.10/dist-packages (from httpcore==1.*->httpx>=0.27.0->chromadb) (0.14.0)\n",
      "Requirement already satisfied: six>=1.9.0 in /usr/local/lib/python3.10/dist-packages (from kubernetes>=28.1.0->chromadb) (1.16.0)\n",
      "Requirement already satisfied: python-dateutil>=2.5.3 in /usr/local/lib/python3.10/dist-packages (from kubernetes>=28.1.0->chromadb) (2.8.2)\n",
      "Requirement already satisfied: google-auth>=1.0.1 in /usr/local/lib/python3.10/dist-packages (from kubernetes>=28.1.0->chromadb) (2.27.0)\n",
      "Requirement already satisfied: websocket-client!=0.40.0,!=0.41.*,!=0.42.*,>=0.32.0 in /usr/local/lib/python3.10/dist-packages (from kubernetes>=28.1.0->chromadb) (1.8.0)\n",
      "Requirement already satisfied: requests-oauthlib in /usr/local/lib/python3.10/dist-packages (from kubernetes>=28.1.0->chromadb) (1.3.1)\n",
      "Requirement already satisfied: oauthlib>=3.2.2 in /usr/local/lib/python3.10/dist-packages (from kubernetes>=28.1.0->chromadb) (3.2.2)\n",
      "Requirement already satisfied: urllib3>=1.24.2 in /usr/local/lib/python3.10/dist-packages (from kubernetes>=28.1.0->chromadb) (2.0.7)\n",
      "Requirement already satisfied: jsonpatch<2.0,>=1.33 in /usr/local/lib/python3.10/dist-packages (from langchain-core<0.3.0,>=0.2.7->langchain) (1.33)\n",
      "Requirement already satisfied: coloredlogs in /usr/local/lib/python3.10/dist-packages (from onnxruntime>=1.14.1->chromadb) (15.0.1)\n",
      "Requirement already satisfied: flatbuffers in /usr/local/lib/python3.10/dist-packages (from onnxruntime>=1.14.1->chromadb) (24.3.25)\n",
      "Requirement already satisfied: protobuf in /usr/local/lib/python3.10/dist-packages (from onnxruntime>=1.14.1->chromadb) (3.20.3)\n",
      "Requirement already satisfied: sympy in /usr/local/lib/python3.10/dist-packages (from onnxruntime>=1.14.1->chromadb) (1.12.1)\n",
      "Requirement already satisfied: distro<2,>=1.7.0 in /usr/lib/python3/dist-packages (from openai<2.0.0,>=1.26.0->langchain_openai) (1.7.0)\n",
      "Requirement already satisfied: deprecated>=1.2.6 in /usr/local/lib/python3.10/dist-packages (from opentelemetry-api>=1.2.0->chromadb) (1.2.14)\n",
      "Requirement already satisfied: importlib-metadata<=7.1,>=6.0 in /usr/local/lib/python3.10/dist-packages (from opentelemetry-api>=1.2.0->chromadb) (7.1.0)\n",
      "Requirement already satisfied: googleapis-common-protos~=1.52 in /usr/local/lib/python3.10/dist-packages (from opentelemetry-exporter-otlp-proto-grpc>=1.2.0->chromadb) (1.63.1)\n",
      "Requirement already satisfied: opentelemetry-exporter-otlp-proto-common==1.25.0 in /usr/local/lib/python3.10/dist-packages (from opentelemetry-exporter-otlp-proto-grpc>=1.2.0->chromadb) (1.25.0)\n",
      "Requirement already satisfied: opentelemetry-proto==1.25.0 in /usr/local/lib/python3.10/dist-packages (from opentelemetry-exporter-otlp-proto-grpc>=1.2.0->chromadb) (1.25.0)\n",
      "Requirement already satisfied: opentelemetry-instrumentation-asgi==0.46b0 in /usr/local/lib/python3.10/dist-packages (from opentelemetry-instrumentation-fastapi>=0.41b0->chromadb) (0.46b0)\n",
      "Requirement already satisfied: opentelemetry-instrumentation==0.46b0 in /usr/local/lib/python3.10/dist-packages (from opentelemetry-instrumentation-fastapi>=0.41b0->chromadb) (0.46b0)\n",
      "Requirement already satisfied: opentelemetry-semantic-conventions==0.46b0 in /usr/local/lib/python3.10/dist-packages (from opentelemetry-instrumentation-fastapi>=0.41b0->chromadb) (0.46b0)\n",
      "Requirement already satisfied: opentelemetry-util-http==0.46b0 in /usr/local/lib/python3.10/dist-packages (from opentelemetry-instrumentation-fastapi>=0.41b0->chromadb) (0.46b0)\n",
      "Requirement already satisfied: setuptools>=16.0 in /usr/local/lib/python3.10/dist-packages (from opentelemetry-instrumentation==0.46b0->opentelemetry-instrumentation-fastapi>=0.41b0->chromadb) (67.7.2)\n",
      "Requirement already satisfied: wrapt<2.0.0,>=1.0.0 in /usr/local/lib/python3.10/dist-packages (from opentelemetry-instrumentation==0.46b0->opentelemetry-instrumentation-fastapi>=0.41b0->chromadb) (1.14.1)\n",
      "Requirement already satisfied: asgiref~=3.0 in /usr/local/lib/python3.10/dist-packages (from opentelemetry-instrumentation-asgi==0.46b0->opentelemetry-instrumentation-fastapi>=0.41b0->chromadb) (3.8.1)\n",
      "Requirement already satisfied: monotonic>=1.5 in /usr/local/lib/python3.10/dist-packages (from posthog>=2.4.0->chromadb) (1.6)\n",
      "Requirement already satisfied: backoff>=1.10.0 in /usr/local/lib/python3.10/dist-packages (from posthog>=2.4.0->chromadb) (2.2.1)\n",
      "Requirement already satisfied: annotated-types>=0.4.0 in /usr/local/lib/python3.10/dist-packages (from pydantic<3,>=1->langchain) (0.7.0)\n",
      "Requirement already satisfied: pydantic-core==2.18.4 in /usr/local/lib/python3.10/dist-packages (from pydantic<3,>=1->langchain) (2.18.4)\n",
      "Requirement already satisfied: charset-normalizer<4,>=2 in /usr/local/lib/python3.10/dist-packages (from requests<3,>=2->langchain) (3.3.2)\n",
      "Requirement already satisfied: greenlet!=0.4.17 in /usr/local/lib/python3.10/dist-packages (from SQLAlchemy<3,>=1.4->langchain) (3.0.3)\n",
      "Requirement already satisfied: regex>=2022.1.18 in /usr/local/lib/python3.10/dist-packages (from tiktoken<1,>=0.7->langchain_openai) (2024.5.15)\n",
      "Requirement already satisfied: huggingface-hub<1.0,>=0.16.4 in /usr/local/lib/python3.10/dist-packages (from tokenizers>=0.13.2->chromadb) (0.23.4)\n",
      "Requirement already satisfied: click>=8.0.0 in /usr/local/lib/python3.10/dist-packages (from typer>=0.9.0->chromadb) (8.1.7)\n",
      "Requirement already satisfied: shellingham>=1.3.0 in /usr/local/lib/python3.10/dist-packages (from typer>=0.9.0->chromadb) (1.5.4)\n",
      "Requirement already satisfied: rich>=10.11.0 in /usr/local/lib/python3.10/dist-packages (from typer>=0.9.0->chromadb) (13.7.1)\n",
      "Requirement already satisfied: httptools>=0.5.0 in /usr/local/lib/python3.10/dist-packages (from uvicorn[standard]>=0.18.3->chromadb) (0.6.1)\n",
      "Requirement already satisfied: python-dotenv>=0.13 in /usr/local/lib/python3.10/dist-packages (from uvicorn[standard]>=0.18.3->chromadb) (1.0.1)\n",
      "Requirement already satisfied: uvloop!=0.15.0,!=0.15.1,>=0.14.0 in /usr/local/lib/python3.10/dist-packages (from uvicorn[standard]>=0.18.3->chromadb) (0.19.0)\n",
      "Requirement already satisfied: watchfiles>=0.13 in /usr/local/lib/python3.10/dist-packages (from uvicorn[standard]>=0.18.3->chromadb) (0.22.0)\n",
      "Requirement already satisfied: websockets>=10.4 in /usr/local/lib/python3.10/dist-packages (from uvicorn[standard]>=0.18.3->chromadb) (12.0)\n",
      "Requirement already satisfied: exceptiongroup in /usr/local/lib/python3.10/dist-packages (from anyio->httpx>=0.27.0->chromadb) (1.2.1)\n",
      "Requirement already satisfied: dnspython>=2.0.0 in /usr/local/lib/python3.10/dist-packages (from email_validator>=2.0.0->fastapi>=0.95.2->chromadb) (2.6.1)\n",
      "Requirement already satisfied: cachetools<6.0,>=2.0.0 in /usr/local/lib/python3.10/dist-packages (from google-auth>=1.0.1->kubernetes>=28.1.0->chromadb) (5.3.3)\n",
      "Requirement already satisfied: pyasn1-modules>=0.2.1 in /usr/local/lib/python3.10/dist-packages (from google-auth>=1.0.1->kubernetes>=28.1.0->chromadb) (0.4.0)\n",
      "Requirement already satisfied: rsa<5,>=3.1.4 in /usr/local/lib/python3.10/dist-packages (from google-auth>=1.0.1->kubernetes>=28.1.0->chromadb) (4.9)\n",
      "Requirement already satisfied: filelock in /usr/local/lib/python3.10/dist-packages (from huggingface-hub<1.0,>=0.16.4->tokenizers>=0.13.2->chromadb) (3.15.1)\n",
      "Requirement already satisfied: fsspec>=2023.5.0 in /usr/local/lib/python3.10/dist-packages (from huggingface-hub<1.0,>=0.16.4->tokenizers>=0.13.2->chromadb) (2023.6.0)\n",
      "Requirement already satisfied: zipp>=0.5 in /usr/local/lib/python3.10/dist-packages (from importlib-metadata<=7.1,>=6.0->opentelemetry-api>=1.2.0->chromadb) (3.19.2)\n",
      "Requirement already satisfied: MarkupSafe>=2.0 in /usr/local/lib/python3.10/dist-packages (from jinja2>=2.11.2->fastapi>=0.95.2->chromadb) (2.1.5)\n",
      "Requirement already satisfied: jsonpointer>=1.9 in /usr/local/lib/python3.10/dist-packages (from jsonpatch<2.0,>=1.33->langchain-core<0.3.0,>=0.2.7->langchain) (3.0.0)\n",
      "Requirement already satisfied: markdown-it-py>=2.2.0 in /usr/local/lib/python3.10/dist-packages (from rich>=10.11.0->typer>=0.9.0->chromadb) (3.0.0)\n",
      "Requirement already satisfied: pygments<3.0.0,>=2.13.0 in /usr/local/lib/python3.10/dist-packages (from rich>=10.11.0->typer>=0.9.0->chromadb) (2.16.1)\n",
      "Collecting mypy-extensions>=0.3.0 (from typing-inspect<1,>=0.4.0->dataclasses-json<0.7,>=0.5.7->langchain-community)\n",
      "  Downloading mypy_extensions-1.0.0-py3-none-any.whl (4.7 kB)\n",
      "Requirement already satisfied: humanfriendly>=9.1 in /usr/local/lib/python3.10/dist-packages (from coloredlogs->onnxruntime>=1.14.1->chromadb) (10.0)\n",
      "Requirement already satisfied: mpmath<1.4.0,>=1.1.0 in /usr/local/lib/python3.10/dist-packages (from sympy->onnxruntime>=1.14.1->chromadb) (1.3.0)\n",
      "Requirement already satisfied: mdurl~=0.1 in /usr/local/lib/python3.10/dist-packages (from markdown-it-py>=2.2.0->rich>=10.11.0->typer>=0.9.0->chromadb) (0.1.2)\n",
      "Requirement already satisfied: pyasn1<0.7.0,>=0.4.6 in /usr/local/lib/python3.10/dist-packages (from pyasn1-modules>=0.2.1->google-auth>=1.0.1->kubernetes>=28.1.0->chromadb) (0.6.0)\n",
      "Installing collected packages: mypy-extensions, marshmallow, typing-inspect, dataclasses-json, langchain-community\n",
      "Successfully installed dataclasses-json-0.6.7 langchain-community-0.2.5 marshmallow-3.21.3 mypy-extensions-1.0.0 typing-inspect-0.9.0\n"
     ]
    }
   ],
   "source": [
    "import os\n",
    "\n",
    "try:\n",
    "    from google.colab import drive, userdata\n",
    "    COLAB = True\n",
    "    print(\"Note: using Google CoLab\")\n",
    "except:\n",
    "    print(\"Note: not using Google CoLab\")\n",
    "    COLAB = False\n",
    "\n",
    "# OpenAI Secrets\n",
    "if COLAB:\n",
    "    os.environ[\"OPENAI_API_KEY\"] = userdata.get('OPENAI_API_KEY')\n",
    "\n",
    "# Install needed libraries in CoLab\n",
    "if COLAB:\n",
    "    !pip install langchain langchain_openai langchain-community pypdf chromadb"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "pC9A-LaYhsta"
   },
   "source": [
    "# 6.1: Introduction to Retrieval-Augmented Generation (RAG)\n",
    "\n",
    "Large Language Models (LLMs), like those integrated into LangChain, are powerful tools for processing and understanding large amounts of text. These models can be leveraged to answer questions based on the content of a document, making them highly valuable for tasks that require information extraction and comprehension.\n",
    "\n",
    "One of the techniques used in conjunction with LLMs for document-based question answering is the Retrieval-Augmented Generation (RAG). RAG is a hybrid approach that combines the strengths of information retrieval systems with the generative capabilities of language models. Here's a brief overview of how RAG works:\n",
    "\n",
    "1. **Retrieval Phase:** When a question is posed, the RAG system first retrieves relevant documents or document segments from a large corpus. This retrieval is based on the similarity of the content in the documents to the question. The idea is to find textual evidence that could contain the answer.\n",
    "2. **Augmentation Phase:** The retrieved documents are then fed into a language model as contextual information. This step is crucial as it provides the language model with specific data relevant to the question, which might not be present in the model’s pre-trained knowledge base.\n",
    "3. **Generation Phase:** With the context provided by the retrieved documents, the language model generates a response. The model synthesizes the information from the documents, using its understanding of language and context to formulate a coherent and accurate answer.\n",
    "\n",
    "By combining the retrieval of relevant information with the generative power of LLMs, RAG effectively enhances the model's ability to provide precise answers based on the content of a document. This approach is particularly useful in scenarios where the direct answer to a question may not be readily available in the model's training data, requiring the system to fetch external evidence to support response generation. This makes LangChain's integration of LLMs with RAG a robust solution for document-based question answering, enabling deep understanding and nuanced responses across various domains and types of inquiries.\n",
    "\n",
    "We begin by opening a connection to an OpenAI LLM model."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "id": "dfISNTpwOKiZ"
   },
   "outputs": [],
   "source": [
    "from langchain.chains.summarize import load_summarize_chain\n",
    "from langchain.document_loaders import PyPDFLoader, TextLoader\n",
    "from langchain import OpenAI, PromptTemplate\n",
    "from langchain_openai import ChatOpenAI\n",
    "from IPython.display import display_markdown\n",
    "\n",
    "MODEL = 'gpt-4o-mini'\n",
    "\n",
    "llm = ChatOpenAI(\n",
    "        model=MODEL,\n",
    "        temperature=0.2,\n",
    "        n=1\n",
    "    )\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "OLUWbW7NRp4m"
   },
   "source": [
    "We will now load multiple PDFs that we can query with questions. We can ask questions about these documents with information that is not part of the foundation model. We create loaders for each of the PDFs so that we can load them into a vector store for easy querying.\n",
    "\n",
    "The following code snippet demonstrates how to use a specific 'load_summarize_chain' function to set up a summarization process using a Large Language Model (LLM) with a \"map_reduce\" chain type. It starts by loading a PDF from the given URL (\"https://arxiv.org/pdf/1706.03762\") using the 'PyPDFLoader'. The loaded document is then split into manageable parts ('load_and_split'). These parts are fed into the summarization chain ('chain.run(docs)'), which processes and condenses the content. Finally, the summarized content is displayed in markdown format directly within the output environment, ensuring that the formatting of the summary remains intact."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "wIVM9OwssPZm",
    "outputId": "fcdb3459-5cec-4a71-a8f9-60d636c9a550"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Reading: https://arxiv.org/pdf/1706.03762\n",
      "Reading: https://arxiv.org/pdf/1810.04805\n",
      "Reading: https://arxiv.org/pdf/2005.14165\n",
      "Reading: https://arxiv.org/pdf/1910.10683\n"
     ]
    }
   ],
   "source": [
    "urls = [\n",
    "  \"https://arxiv.org/pdf/1706.03762\",\n",
    "  \"https://arxiv.org/pdf/1810.04805\",\n",
    "  \"https://arxiv.org/pdf/2005.14165\",\n",
    "  \"https://arxiv.org/pdf/1910.10683\"\n",
    "]\n",
    "\n",
    "loaders = []\n",
    "\n",
    "chain = load_summarize_chain(llm, chain_type=\"map_reduce\")\n",
    "\n",
    "for url in urls:\n",
    "  print(f\"Reading: {url}\")\n",
    "  loader = PyPDFLoader(url)\n",
    "  loaders.append(loader)\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "EiuExFf4Z50G"
   },
   "source": [
    "Next we load embeddings from the four documents into [ChromaDB](https://www.trychroma.com/). These embeddings will allow the prompt to be augmented with information from the PDF papers that we loaded."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {
    "id": "ms1lqPY_5HTG"
   },
   "outputs": [],
   "source": [
    "from langchain.indexes import VectorstoreIndexCreator\n",
    "from langchain.document_loaders import PyPDFDirectoryLoader\n",
    "from langchain_openai import OpenAIEmbeddings\n",
    "from langchain_community.vectorstores.inmemory import InMemoryVectorStore\n",
    "\n",
    "embeddings_model = OpenAIEmbeddings()\n",
    "index = VectorstoreIndexCreator(embedding=embeddings_model,vectorstore_cls=InMemoryVectorStore).from_loaders(loaders)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "s3T-ADPcap7Q"
   },
   "source": [
    "Now that the embeddings are loaded, we can query the model for information that is only contained in the documents."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 35
    },
    "id": "j6C1ACgT5is6",
    "outputId": "1d741d55-d5ff-4343-f555-ca4d2f0c5494"
   },
   "outputs": [
    {
     "data": {
      "application/vnd.google.colaboratory.intrinsic+json": {
       "type": "string"
      },
      "text/plain": [
       "'The left figure in Figure 2 demonstrates Scaled Dot-Product Attention.'"
      ]
     },
     "execution_count": 18,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "query = \"Which figure demonstrates Scaled Dot-Product Attention?\"\n",
    "\n",
    "index.query(query,llm=llm)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "ErkO7KR9apEY"
   },
   "source": [
    "## Limitations of RAG\n",
    "\n",
    "Language Model Retrieval-Augmented Generation (LLM RAG) combines the capabilities of large language models with external data retrieval mechanisms. This approach enhances language models' performance by providing access to specific, often proprietary, data that the pre-trained model's general knowledge might lack. However, the effectiveness of LLM RAG diminishes significantly when the augmented data is already common knowledge and inherently included in the foundation model.\n",
    "\n",
    "LLM RAG excels in scenarios where users require proprietary or highly specialized information. In fields such as finance, law, or technical industries, specific data sets, reports, or documents are essential for accurate responses. RAG's unique ability to pull in this external data ensures highly precise and contextually relevant answers. While the foundation model is extensively trained on a broad array of publicly available information, it may lack the depth or latest updates required for these niche domains. Therefore, RAG's integration of proprietary data sets leads to more accurate and contextually enriched outputs.\n",
    "\n",
    "It's important to consider that when the data intended for augmentation is already common knowledge, the benefits of LLM RAG reduce considerably. The foundation model undergoes pre-training on vast amounts of data, encompassing a wide range of general knowledge topics. Consequently, when queries involve information that falls within this general scope, the foundation model generates accurate and informative responses without the need for external augmentation. In such cases, using RAG does not add significant value and can introduce unnecessary complexity and processing overhead.\n",
    "\n",
    "Moreover, retrieving data that the model already understands well leads to inefficiencies. The foundational model's extensive pre-training includes diverse texts, meaning that its internal knowledge base is typically sufficient for common knowledge queries. Therefore, relying on RAG in these instances does not leverage the model's strengths effectively and detracts from the system's overall efficiency.\n",
    "\n",
    "In conclusion, while LLM RAG highly benefits the augmentation of language models with proprietary or highly specialized data, its efficacy wanes when dealing with common knowledge. The foundation model's comprehensive training already covers a vast expanse of general information, rendering RAG augmentation superfluous in such contexts. Therefore, strategically employ RAG, which offers the most significant enhancement in areas requiring access to proprietary data that the foundational model might only partially encompass."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "j7K2Fzfp3Vux"
   },
   "source": [
    "# Module 6 Assignment\n",
    "\n",
    "You can find the first assignment here: [assignment 6](https://github.com/jeffheaton/app_generative_ai/blob/main/assignments/assignment_yourname_class6.ipynb)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "NemgD3ZI3Vux"
   },
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "anaconda-cloud": {},
  "colab": {
   "provenance": []
  },
  "kernelspec": {
   "display_name": "Python 3.11 (genai)",
   "language": "python",
   "name": "genai"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.8"
  },
  "varInspector": {
   "cols": {
    "lenName": 16,
    "lenType": 16,
    "lenVar": 40
   },
   "kernels_config": {
    "python": {
     "delete_cmd_postfix": "",
     "delete_cmd_prefix": "del ",
     "library": "var_list.py",
     "varRefreshCmd": "print(var_dic_list())"
    },
    "r": {
     "delete_cmd_postfix": ") ",
     "delete_cmd_prefix": "rm(",
     "library": "var_list.r",
     "varRefreshCmd": "cat(var_dic_list()) "
    }
   },
   "types_to_exclude": [
    "module",
    "function",
    "builtin_function_or_method",
    "instance",
    "_Feature"
   ],
   "window_display": false
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
